Cataloguing and Linking Life Sciences LOD Cloud
Abstract
The Life Sciences Linked Open Data (LSLOD) Cloud currently comprises multiple datasets that add high value to biomedical research. The ability to navigate these datasets to derive and discover new, meaningful biological correlations is considered one of the most significant resources for supporting clinical decision making. However, navigating them is not easy: most are fragmented across multiple SPARQL endpoints, each containing trillions of triples and represented with insufficient vocabulary reuse. To retrieve and match, from multiple endpoints, the data required to answer meaningful biological questions, it is first necessary to catalogue the data represented in each endpoint, so as to understand how powerful queries traversing several SPARQL endpoints can be assembled. In this report, we explore the schema used to represent data in a total of 52 meaningful Life Sciences SPARQL endpoints and present our methodology for linking related concepts and properties from the “pool” of available elements. We found the outcome of this exploratory work helpful not only for identifying redundancy and gaps in the data, but also for enabling the assembly of complex federated queries. We present three different approaches used to weave concepts and properties and discuss their applicability for creating complex links in the LSLOD cloud.
Similar resources
Towards Automatic Topical Classification of LOD Datasets
The datasets that are part of the Linking Open Data cloud diagram (LOD cloud) are classified into the following topical categories: media, government, publications, life sciences, geographic, social networking, user-generated content, and cross-domain. The topical categories were manually assigned to the datasets. In this paper, we investigate to what extent the topical classification of new ...
Detecting Inner-Ear Anatomical and Clinical Datasets in the Linked Open Data (LOD) Cloud
The Linked Open Data (LOD) Cloud is a mesh of open datasets coming from different domains. Among these, a notable number of datasets belong to the life sciences domain and are linked together, forming an interlinked “Life Sciences Linked Open Data (LSLOD) Cloud”. One of the key challenges for data publishers is to identify and establish links between newly generated domain specific datasets and LSL...
A-Posteriori Integration for Life Sciences Data
Multiple datasets that add high value to biomedical research have been exposed on the web as part of the Life Sciences Linked Open Data (LS-LOD) Cloud. The ability to easily navigate through these datasets is crucial in order to draw meaningful biological correlations. However, navigating these multiple datasets is not trivial as most of these are only available as isolated SPARQL endpoints wit...
Challenges for Semantically Driven Collaborative Spaces
Linked Data initiatives have fostered the publication of more than one thousand datasets in the Linking Open Data (LOD) cloud from a large variety of domains, e.g., Life Sciences, Media, and Government. Albeit large in volume, Linked Data is essentially read-only and most collaborative tasks of cleaning, enriching, and reasoning are not dynamically available. Collaboration between data produ...
How to semantically relate dialectal Dictionaries in the Linked Data Framework
We describe on-going work towards publishing language resources included in dialectal dictionaries in the Linked Open Data (LOD) cloud, and so to support wider access to the diverse cultural data associated with such dictionary entries, like the various historical and geographical variations of the use of such words. Beyond this, our approach allows the cross-linking of entries of dialectal dic...
Publication date: 2012